31. (Amended) The system for determining a title from a document image according to
claim 23 wherein said natural language analysis unit generates an analysis result on at least one 
criterion of said multiple criteria indicating whether or not said characters end in a noun form. 

32. (Amended) The system for determining a title from a document image according to 
claim 23 wherein said natural language analysis unit generates an analysis result on at least one 
criterion of said multiple criteria indicating whether or not said characters end in a set of 
predetermined suffixes. 

33. (Amended) The system for determining a title from a document image according to 
claim 20 wherein said character row area determination unit generates at least one criterion of 
said multiple criteria based on a ratio between a length and a height of each of said 
circumscribing rectangles. 

34. (Amended) The system for determining a title from a document image according to 
claim 20 wherein said character row area determination unit generates at least one criterion of 
said multiple criteria based on a ratio between a summed width of said characters and a 
corresponding one of said circumscribing rectangles. 
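
By way of illustration only, the two geometric criteria recited in claims 33 and 34 might be computed as in the following Python sketch. The `Rect` type, the function names, and the choice to compare the summed character width against the width of the row rectangle are assumptions made for this example; they are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Rect:
    """Axis-aligned circumscribing rectangle in pixels (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float


def aspect_ratio(row: Rect) -> float:
    """Ratio between the length and the height of a character row rectangle."""
    if row.height <= 0:
        raise ValueError("rectangle height must be positive")
    return row.width / row.height


def fill_ratio(chars: Sequence[Rect], row: Rect) -> float:
    """Summed character width divided by the row rectangle width.

    Assumption: the claim's ratio is taken against the rectangle's width.
    """
    if row.width <= 0:
        raise ValueError("rectangle width must be positive")
    return sum(c.width for c in chars) / row.width
```

Either ratio could then serve as one criterion among the multiple criteria that contribute to the single value.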



A copy of the claims, with amendments marked, appears in Appendix 3. 

REMARKS 
The Rejections under 35 USC §102 

The Examiner rejected claims 1 and 19 under 35 USC §102 as allegedly being anticipated 
by Abe. Although Applicant does not necessarily agree with all of the Examiner's allegations, 
independent claims 1 and 19 have been amended to clarify the patentable features of the current 
invention. 

Newly amended independent claims 1 and 19 now each explicitly recite "a single value 
based upon multiple criteria obtained during said character recognition and said title evaluation 
...." In other words, the newly amended independent claims recite that the decision of whether 
or not a given minimum circumscribing rectangle contains the title of a document depends on 
one quantity or a single value. This single value represents one quantity that is assigned based 
on a combination of two or more individual criteria. 
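
By way of illustration only, the notion of a single value derived from multiple criteria can be sketched in Python as a sum of points. The criterion names, the point values, and the dictionary keys below are hypothetical; only the 18-point character size comes from the specification's example.

```python
from typing import Any, Callable, Dict, Mapping, Sequence

Region = Mapping[str, Any]            # one minimal circumscribing rectangle
Criterion = Callable[[Region], int]   # returns points for one criterion

# Hypothetical criteria and point values, for illustration only.
CRITERIA: Dict[str, Criterion] = {
    "large_font": lambda r: 3 if r.get("font_pt", 0) > 18 else 0,
    "centered": lambda r: 2 if r.get("centered", False) else 0,
    "natural_language": lambda r: 4 if r.get("nl_match", False) else 0,
}


def title_value(region: Region, criteria: Dict[str, Criterion] = CRITERIA) -> int:
    """Combine every individual criterion into one quantity (a sum of points)."""
    return sum(criterion(region) for criterion in criteria.values())


def select_title(regions: Sequence[Region]) -> Region:
    """Pick the rectangle whose single value is the highest."""
    return max(regions, key=title_value)
```

The point of the sketch is that the title decision rests on one combined number per rectangle rather than on any individual criterion considered alone.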



The above claim amendments are supported by the original disclosures of the current 
application. See, for example, Figure 4, specification page 10 at lines 10-22, and the claims as 
originally filed. For this reason, Applicant respectfully submits that the claim amendments do 
not add any new matter to the current application. 



In contrast to the above explicit recitation of the patentable features, the Abe reference 
describes determining a title based upon individual criteria, such as font size, or the position of a 
font of a particular size on a given page. Despite these disclosures, the cited reference fails to 
anticipate the condition of "a single value based upon multiple criteria" as explicitly recited in 
newly amended independent claims 1 and 19. Although the Abe reference sets forth a method of 
identifying a title based upon individual pieces of information, the cited reference lacks any 
disclosure to anticipate combining the individual criteria to produce a single quantity to express 
the likelihood that a given string of text includes a title. 

Based upon the above clear patentable distinction, Applicant respectfully submits to the 
Examiner that the Abe reference fails to anticipate every element of newly amended independent 
claims 1 and 19. Therefore, Applicant further respectfully submits to the Examiner that the 
pending section 102 rejections should be withdrawn upon reconsideration. 

The Rejections under 35 USC §103 

The Examiner has rejected claims 2-18 and 20-36 as allegedly obvious. Claims 2-12, 
15-18, 20-30, and 33-36 are rejected over Abe in light of Katsuyama, and claims 13, 14, 31, and 
32 are rejected over Abe in view of Katsuyama, further in view of Chen. 

As discussed above, the Abe reference sets forth a method of identifying a title based 
upon individual pieces of information such as font size, or the position of a font of a particular 
size on a given page. The cited reference, however, lacks any disclosure or suggestion for 
combining the individual criteria to produce a single quantity to express the likelihood that a 
given string of text includes a title. 


Docket No. RCOH-1020 W 



PATENT 



The Katsuyama reference describes a method of title extraction using multiple criteria to 
which points are assigned; however, this reference contains no teaching or suggestion regarding 
natural language. 

Finally, Chen describes certain techniques for identifying user-defined keywords without 
resolving the individual characters. In Chen, prefixes and suffixes of the user-defined keywords 
are identified by whether they contain characters that extend to the top of a line of text or 
descend below a line of text. In other words, the Chen reference relies on the character image 
and fails to rely upon natural language information. Furthermore, the Chen reference fails to 
teach or suggest how the techniques might be applied to title extraction. 

Applicant respectfully submits that newly amended independent claims 1 and 19 are not 
obvious in light of the cited references. With respect to the rejection over Abe in view of 
Katsuyama, newly amended claims 1 and 19 respectively recite a method and a system of 
"determining a title from a document image." Both the newly amended method and the system 
claims require that the title be determined by "a single value based upon multiple criteria", 
wherein natural language likelihood is always required as one of the criteria. Natural language 
factors exemplified in the specification include identifying specific words, such as "title:" or 
"re:", or determining whether a word is in noun form, or whether it ends in a certain suffix. 
Neither Abe nor Katsuyama contains any teaching or suggestion regarding natural language 
likelihood. The cited references, therefore, fail to teach, disclose, or suggest every element of 
Applicant's claimed invention. 
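
By way of illustration only, a natural language criterion of the kind described above might be sketched in Python as follows. The cue words "title:" and "re:" are those named in the specification examples; the English suffix list, the point values, and the function name are hypothetical stand-ins for the predetermined dictionary.

```python
# Cue words named in the remarks above.
CUE_WORDS = ("title:", "re:")
# Hypothetical suffix list standing in for the predetermined dictionary.
NOUN_SUFFIXES = ("tion", "ment", "ness")


def natural_language_points(text: str, cue_points: int = 5, suffix_points: int = 3) -> int:
    """Award points when the recognized characters look like a title."""
    normalized = text.strip().lower()
    points = 0
    # Criterion 1: the row begins with a specific cue word.
    if normalized.startswith(CUE_WORDS):
        points += cue_points
    # Criterion 2: the row ends in one of the predetermined suffixes.
    if normalized.rstrip(".!?:").endswith(NOUN_SUFFIXES):
        points += suffix_points
    return points
```

Unlike a criterion based on character shape, this check operates on the recognized character codes, so it presupposes a character recognition step.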

With respect to the rejection over Abe in view of Katsuyama and Chen, Applicant 
respectfully submits that these references neither teach nor suggest the combination of elements 
in the claims as amended herein. First, neither Abe nor Katsuyama includes any teaching about 
natural language, and Chen includes no teaching about title extraction. Moreover, Chen teaches 
away from the present invention, because the purpose of Chen's invention is to recognize 
keywords without recognizing the individual characters or words. By contrast, Applicant's 
independent claims 1 and 19 respectively require a character recognition step and a character 
recognition unit. Chen therefore fails to provide any suggestion for combining Abe and 
Katsuyama to arrive at Applicant's invention. 

Applicant therefore respectfully submits that the cited references do not contain any 
suggestion or motivation to combine the elements as claimed in the present application. Because 
of the lack of the necessary suggestion, Applicant further respectfully submits that the Chen 
reference may not properly be cited in a rejection of Applicant's claims under 35 USC §103. 

Based upon the above discussed patentable features of the newly amended independent 
claims, Applicant respectfully submits that it would not have been obvious to one of ordinary 
skill to provide the patentable features of newly amended independent claims 1 and 19 based 
upon the cited references alone or in combination. 

All of the dependent claims currently pending in the present application ultimately 
depend from newly amended independent claim 1 or 19, and therefore incorporate the above 
discussed patentable features of the newly amended independent claims. Because newly 
amended independent claims 1 and 19 are not obvious over the cited references, dependent 
claims 2-18 and 20-36 are also not obvious. Therefore, the Applicant respectfully submits that 
the pending section 103 rejections should be withdrawn upon reconsideration. 

Other Amendments 

The specification has been amended to further clarify two incorporations by reference, 
and also to correct an apparent typographical error in the detailed description. 

Claims 2, 3, 6, 7, 9, 11, 13-16, 20, 21, 25, 27, and 31-34 have been amended to 
compensate for the changes in antecedent nouns that result from the present amendments to 
claims 1 and 19. 

In addition, claims 33 and 34 are amended to rectify an apparent typographical error. 
Because claims 33 and 34 are both directed to systems, rather than methods, they depend from 
claim 20, rather than claim 2. 



Claims 4, 12 and 30 have been amended to further clarify their scope. 

The above amendments are purely formal. Accordingly, Applicant believes that these 
amendments do not introduce new matter into the application. Furthermore, Applicant believes 
that the above amendments do not change the scope of the claims. 



Conclusion 

In view of the above amendments and the foregoing remarks, Applicant respectfully 
submits that all of the pending claims are in condition for allowance and respectfully requests a 
favorable Office Action so indicating. 

Respectfully submitted, 

Date: October 14, 2002 

KNOBLE & YOSHIDA LLC 
Eight Penn Center, Suite 1350 
1628 John F. Kennedy Blvd. 
Philadelphia, PA 19103 
(215) 599-0600 

Appendices: 1. Replacement Paragraphs of the Specification 
2. Replacement Paragraphs of the Specification, with Amendments Marked 
3. Claims with Amendments Marked 

Appendix 2: Replacement Paragraphs of the Specification, with Amendments Marked 

Paragraph beginning on page 4 at line 6: 

Now referring to FIGURE 4, a block diagram illustrates a third preferred embodiment of 
the system for determining a title from a document image according to the current invention. An 
image input unit 121 inputs a document image, and a document image storage unit 122 stores the 
inputted document image. A character row area determination unit 123 determines areas or 
minimal circumscribing rectangles that contain characters. The character row area determination 
unit 123 outputs the coordinates as well as the size of character row areas to a character 
recognition unit 124 as well as a title evaluation point determination unit 128. The character 
recognition unit 124 recognizes characters from character image portions in the character row 
areas. [For] A reference describing [the] character recognition, [disclosures in] U.S. Pat. No. 
5,966,464 [are hereby], is incorporated by [external referenced] reference herein in its entirety. The 
character recognition unit 124 generates corresponding character codes as well as other 
associated information. Other associated information includes the character recognition 
assurance level, the coordinates of a minimal circumscribing rectangle and the size of the 
rectangle. The outputs from the character recognition unit 124 are sent to a font determination 
unit 125, the title evaluation point determination unit 128, a natural language analysis unit 126 
and a recognition result storage unit 129. The font determination unit 125 determines a font type 
and other associated information for each character and outputs the font information to the title 
evaluation point determination unit 128. A reference describing [Disclosures on the] font 
determination, [in] Japanese Patent Laid Publication Hei 9-319830, is [are hereby] incorporated by 
reference herein in its entirety [external referenced]. The natural language analysis unit 126 
compares the recognized characters against a predetermined dictionary and determines whether 
or not the recognized characters match or resemble any of the predetermined titles or words in a 
dictionary. For example, the dictionary contains a set of predetermined suffixes which indicate a 
noun form and its corresponding statistical information. The natural language analysis unit 126 
also outputs the determination information to the title evaluation point determination unit 128. 
A characteristics extraction unit 127 extracts information on certain layouts such as underlining, 
centering and the minimal circumscribing rectangle size from the input image and outputs the 
information to the title evaluation point determination unit 128. For example, if the character 
size is beyond 18-point in an A4 image, the minimal circumscribing rectangle containing the 
characters is assigned a high score. Similarly, a high score is assigned to a minimal 
circumscribing rectangle if a number of characters or words in the rectangle is less than a 
predetermined number. For example, for the Japanese language, the predetermined number of 
characters may be set to twelve. The above and other predetermined numbers are user-definable. 

Paragraph beginning on page 9 at line 29: 

FIGURE 7 illustrates other acts involved in determining the likelihood based upon a 
number of characters according to the current invention. In act A401, a document image is 
inputted, and character row areas are determined in act A402. After the character image in the 
character row areas is converted into character codes, a number of characters is determined. The 
number of characters is compared to a predetermined threshold value in act A404. A set of 
predetermined threshold values is optionally stored in a statistical dictionary for different types 
of documents. If the number of characters is below the predetermined threshold value in act 
A405, a predetermined number of points is added to the likelihood for the character row area and 
a title area selection is determined based upon the total number of points in act A406. On the 
other hand, if the number of characters is [below] above the predetermined threshold value in act 
A405, other predetermined processing is performed. 
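
By way of illustration only, the acts of FIGURE 7 might be sketched in Python as follows. The `THRESHOLDS` table stands in for the statistical dictionary; its keys, the default value of 20, the number of points, and the function name are hypothetical, while the value of twelve for Japanese text follows the example given in the specification. The handling of a count exactly equal to the threshold is not stated in the text and is treated here as not below it.

```python
from typing import Dict, List, Optional

# Hypothetical per-document-type thresholds (the "statistical dictionary").
THRESHOLDS: Dict[str, int] = {"japanese": 12, "default": 20}


def select_title_by_char_count(rows: List[dict], doc_type: str = "default",
                               points: int = 5) -> Optional[dict]:
    """Add points to short character rows, then pick the best-scoring row."""
    threshold = THRESHOLDS.get(doc_type, THRESHOLDS["default"])
    for row in rows:                          # rows as produced by acts A401-A402
        n_chars = len(row["text"])            # number of recognized characters
        if n_chars < threshold:               # comparison of acts A404-A405
            row["likelihood"] = row.get("likelihood", 0) + points
        # Otherwise: "other predetermined processing" (not modeled here).
    # Act A406: title area selection based on the total number of points.
    return max(rows, key=lambda r: r.get("likelihood", 0), default=None)
```

Because the points are added to an existing likelihood, this criterion can be combined with points contributed by other criteria before the selection in act A406.
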
Appendix 3: Claims with Amendments Marked 

1. (Amended) A method of determining a title from a document image, comprising: 

dividing the document image into minimal circumscribing rectangles which contain a 
character image; 

recognizing characters in said minimal circumscribing rectangles; and 

determining a title of the document image based upon a likelihood of each of said 
minimal circumscribing rectangles containing a title, said likelihood being determined by a 
single value based upon [information] multiple criteria obtained during said character recognition 
and said title determination, said multiple criteria comprising natural language likelihood and any 
combination of character row area coordinates, character type, number of characters, character 
code assurance, character minimum circumscribing rectangle coordinates, and character 
minimum circumscribing rectangle size. 

2. (Amended) The method of determining a title from a document image according to 
claim 1 wherein said [likelihood is expressed in] single value includes a sum of points based on 
said multiple criteria [information]. 

3. (Amended) The method of determining a title from a document image according to 
claim 2 wherein said [information includes] multiple criteria include characteristics on font. 

4. (Amended) The method of determining a title from a document image according to 
claim 2 wherein said font characteristics [is determined on] include a frequency of a particular font 
type. 

6. (Amended) The method of determining a title from a document image according to 
claim 5 wherein said multiple criteria include [information includes] a result of said matching with 
said predetermined words. 

7. (Amended) The method of determining a title from a document image according to 
claim 2 wherein said multiple criteria include [information includes] a number of said characters. 

9. (Amended) The method of determining a title from a document image according to 
claim 2 wherein said multiple criteria include [information includes] an assurance level of said 
character recognition. 

11. (Amended) The method of determining a title from a document image according to 
claim 2 wherein said multiple criteria include [information includes] layout characteristics. 

12. (Amended) The method of determining a title from a document image according to 
claim 11 wherein said [information includes] layout characteristics include centering, underlining, 
[and] size, or any combination thereof. 

13. (Amended) The method of determining a title from a document image according to 
claim 2 wherein said [information indicates] multiple criteria include an indication of whether or 
not said characters end in a noun form. 

14. (Amended) The method of determining a title from a document image according to 
claim 2 wherein said multiple criteria include an indication of [information indicates] whether or 
not said characters end in a set of predetermined suffixes. 

15. (Amended) The method of determining a title from a document image according to 
claim 2 wherein said multiple criteria include [information includes] a ratio between a length and a 
height of each of said circumscribing rectangles. 

16. (Amended) The method of determining a title from a document image according to 
claim 2 wherein said multiple criteria include [information includes] a ratio between a summed 
width of said characters and a corresponding one of said circumscribing rectangles. 

19. (Amended) A system for determining a title from a document image, comprising: 

a character row area determination unit for dividing the document image into minimal 
circumscribing rectangles which contain a character image; 

a character recognition unit connected to said character row area determination unit for 
recognizing characters in said minimal circumscribing rectangles; and 

a title evaluation point determination unit connected to said character recognition unit 
for determining a title of the document image based upon a likelihood of each of said minimal 
circumscribing rectangles containing a title, said likelihood being determined by a single value 
based upon [information] multiple criteria obtained during said character recognition and said title 
evaluation, said multiple criteria comprising natural language likelihood and any combination of 
character row area coordinates, character type, number of characters, character code assurance, 
character minimum circumscribing rectangle coordinates, and character minimum circumscribing 
rectangle size. 

20. (Amended) The system for determining a title from a document image according to 
claim 19 wherein said title evaluation point determination unit determines said [likelihood] single 
value in terms of a sum of points based on said multiple criteria [information]. 

21. (Amended) The system for determining a title from a document image according to 
claim 20 wherein said title evaluation point determination unit further comprises a font 
determination unit for generating [information] at least one criterion of said multiple criteria on 
font characteristics. 

25. (Amended) The system for determining a title from a document image according to 
claim 20 wherein said character recognition unit generates [said information] at least one criterion 
of said multiple criteria based on a number of said characters. 

27. (Amended) The system for determining a title from a document image according to 
claim 20 wherein said character recognition unit generates at least one criterion of said multiple 
criteria based [said information] on an assurance level of said character recognition. 

30. (Amended) The system for determining a title from a document image according to 
claim 29 wherein said extraction unit extracts said layout characteristics on centering, 
underlining, [and] size, or any combination thereof. 

31. (Amended) The system for determining a title from a document image according to 
claim 23 wherein said natural language analysis unit generates an analysis result on at least one 
criterion of said multiple criteria [said information] indicating whether or not said characters end in 
a noun form. 

32. (Amended) The system for determining a title from a document image according to 
claim 23 wherein said natural language analysis unit generates an analysis result on at least one 
criterion of said multiple criteria [said information] indicating whether or not said characters end in 
a set of predetermined suffixes. 

33. (Amended) The system for determining a title from a document image according to 
claim 20 [2] wherein said character row area determination unit generates at least one criterion of 
said multiple criteria based [said information] on a ratio between a length and a height of each of 
said circumscribing rectangles. 

34. (Amended) The system for determining a title from a document image according to 
claim 20 [2] wherein said character row area determination unit generates at least one criterion of 
said multiple criteria based [said information] on a ratio between a summed width of said 
characters and a corresponding one of said circumscribing rectangles. 